Gaussian Processes and Bayesian Optimization
Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization
Matrix square roots and their inverses arise frequently in machine learning, e.g., when sampling from high-dimensional Gaussians N(0, K) or "whitening" a vector b against a covariance matrix K. While existing methods typically require O(N^3) computation, we introduce a highly efficient quadratic-time algorithm for computing K^{1/2}b, K^{-1/2}b, and their derivatives through matrix-vector multiplications (MVMs). Our method combines Krylov subspace methods with a rational approximation and typically achieves four decimal places of accuracy with fewer than 100 MVMs. Moreover, the backward pass requires little additional computation. We demonstrate our method's applicability on matrices as large as 50,000 by 50,000 - well beyond traditional methods - with little approximation error. Applying this increased scalability to variational Gaussian processes, Bayesian optimization, and Gibbs sampling results in more powerful models with higher accuracy. In particular, we perform variational GP inference with up to 10,000 inducing points and perform Gibbs sampling on a 25,000-dimensional problem.
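The abstract's core primitive - computing K^{1/2}b using only matrix-vector multiplications via a Krylov subspace - can be sketched with a plain Lanczos iteration. The function below is an illustrative toy (the function name, iteration count, and the use of a dense `sqrtm` on the small tridiagonal matrix are my own choices), not the authors' implementation, which additionally uses a rational approximation:

```python
import numpy as np
from scipy.linalg import sqrtm

def lanczos_sqrt_mvm(K, b, num_iters=50):
    """Approximate K^{1/2} b for SPD K using only MVMs with K.

    Lanczos builds an orthonormal basis Q of the Krylov space
    span{b, Kb, K^2 b, ...} and a small tridiagonal T = Q^T K Q;
    then K^{1/2} b is approximated by ||b|| * Q * T^{1/2} * e_1.
    """
    n = len(b)
    m = min(num_iters, n)
    Q = np.zeros((n, m))
    alpha = np.zeros(m)       # diagonal of T
    beta = np.zeros(m - 1)    # off-diagonal of T
    Q[:, 0] = b / np.linalg.norm(b)
    q_prev = np.zeros(n)
    beta_prev = 0.0
    for j in range(m):
        w = K @ Q[:, j] - beta_prev * q_prev      # one MVM per iteration
        alpha[j] = w @ Q[:, j]
        w = w - alpha[j] * Q[:, j]
        # full reorthogonalization for numerical stability (toy-scale only)
        w = w - Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < m - 1:
            beta_prev = np.linalg.norm(w)
            beta[j] = beta_prev
            q_prev = Q[:, j]
            Q[:, j + 1] = w / beta_prev
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    e1 = np.zeros(m)
    e1[0] = 1.0
    # T is m x m with m << n, so sqrtm(T) is cheap
    return np.linalg.norm(b) * (Q @ (sqrtm(T) @ e1)).real
```

Since each iteration costs one MVM, a structured or sparse K (where an MVM is O(N) or O(N^2)) makes the whole computation far cheaper than the O(N^3) of a dense Cholesky or eigendecomposition.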
Review for NeurIPS paper: Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization
Summary and Contributions: 3rd EDIT: With the notebook now attached to this submission, I am content with the quality of the empirical evaluation. For double precision, the Cholesky decomposition fails less often, whereas MINRES needs more iterations to converge. Since this aspect is neither explored nor even mentioned, I recommend rejecting this paper. EDIT: In their rebuttal, the authors merely brushed over my concern that the number of iterations is limited to J = 200. It appears this parameter is far more crucial than the authors are willing to admit, and reproducing the experiments turned out to be difficult.
Financial Applications of Gaussian Processes and Bayesian Optimization
Gonzalvez, Joan, Lezmi, Edmond, Roncalli, Thierry, Xu, Jiali
In the last five years, the financial industry has been impacted by the emergence of digitalization and machine learning. In this article, we explore two methods that have undergone rapid development in recent years: Gaussian processes and Bayesian optimization. Gaussian processes can be seen as a generalization of Gaussian random vectors and are associated with the development of kernel methods. Bayesian optimization is an approach for performing derivative-free global optimization in low dimensions, and uses Gaussian processes to locate the global maximum of a black-box function. The first part of the article reviews these two tools and shows how they are connected. In particular, we focus on Gaussian process regression, which is the core of Bayesian machine learning, and on the issue of hyperparameter selection. The second part is dedicated to two financial applications. We first consider the modeling of the term structure of interest rates. More precisely, we test the fitting method and compare the GP prediction with the random walk model. The second application is the construction of trend-following strategies, in particular the online estimation of trends and covariance windows.
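The Gaussian process regression that the article calls the core of Bayesian machine learning reduces to a few linear-algebra steps. The sketch below implements the standard zero-mean GP posterior equations; the function names, RBF kernel choice, and hyperparameter values are illustrative, not taken from the paper:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # squared-exponential (RBF) kernel, a common default in GP regression
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2, lengthscale=1.0):
    """Posterior mean and pointwise variance of a zero-mean GP.

    Uses the standard equations:
      mean = K_*^T (K + noise I)^{-1} y
      var  = diag(K_**) - diag(K_*^T (K + noise I)^{-1} K_*)
    solved via a Cholesky factorization for stability.
    """
    K = rbf_kernel(X_train, X_train, lengthscale) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_train, X_test, lengthscale)
    Kss = rbf_kernel(X_test, X_test, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(Kss) - (v ** 2).sum(0)
    return mean, var
```

In Bayesian optimization, this posterior mean and variance feed an acquisition function (e.g., expected improvement), which proposes the next query point for the black-box objective.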